Abstract: It has been proved that in a non-Bayesian parametric
estimation problem, if the Fisher information matrix (FIM) is
singular, unbiased estimators with finite variance for the unknown
parameter do not exist [1], [2], [3]. The Cramér-Rao bound (CRB), a
popular tool for lower bounding the variances of unbiased estimators,
seems inapplicable in such situations. In this paper, we show that
the Moore-Penrose generalized inverse of a singular FIM can 
be interpreted as the CRB corresponding to the minimum 
variance among all choices of minimum constraint functions. 
This result ensures the logical validity of applying the Moore- 
Penrose generalized inverse of an FIM as the covariance lower 
bound when the FIM is singular. Furthermore, the result can be 
applied as a performance bound on the joint design of constraint 
functions and unbiased estimators. 

Index Terms: Cramér-Rao bound (CRB), constrained parameters,
singular Fisher information matrix (FIM).



I. Introduction 

An interpretation of the Moore-Penrose generalized inverse
of a singular information matrix is presented in this paper,
from the perspective of the Cramér-Rao bound (CRB). The CRB is
a lower bound on the covariance matrix of any unbiased
estimator in a non-Bayesian parametric estimation problem
[4], [5], and is a popular tool for evaluating the optimal
mean-square error (MSE) performance of estimators in various
applications [6], [7]. The most general form of the CRB states
that the covariance matrix of any unbiased estimator is lower
bounded by the generalized inverse of the Fisher information
matrix (FIM) [8]. This general form of the CRB holds for both
singular and non-singular FIMs.

There are, however, results in the literature that render the
application of the CRB questionable when the FIM is singular.
Rothenberg proves in [1] that, under some regularity conditions,
the non-singularity of the FIM is equivalent to the local
identifiability of the parameter to be estimated¹; Stoica
et al. prove in [2], [3] that unbiased estimators with finite
variances do not exist when the FIM is singular, except under
some "unusual" conditions². If the parameter to be estimated
is locally non-identifiable, or if all unbiased estimators
have infinite variances, it seems meaningless to discuss
the performance of unbiased estimators.

¹ A parameter $\boldsymbol{\theta}$ is locally identifiable if there exists an open
neighbourhood of $\boldsymbol{\theta}$ such that no other $\boldsymbol{\theta}'$ in that neighbourhood
is observationally equivalent to $\boldsymbol{\theta}$.

² The "unusual" conditions suggest that if the FIM is singular, only unbiased
estimators for some functions of the unknown parameter, instead of the
unknown parameter itself, may exist with finite variances.

As mentioned in [9], one may change the nature of an
estimation problem to allow the existence of reasonable
estimators. There are three approaches. The first approach
is to introduce a priori information about the probability
distribution of the parameter to be estimated; in this way
the estimation problem becomes a Bayesian one. There is
abundant literature on Bayesian statistics [10] and performance
bounds [11]. A priori information about the probability
distribution of the unknown parameter, however, is not always
available. The second approach is to consider biased
estimators instead of unbiased estimators. In [9], the necessary
condition for the bias function to ensure the existence of
reasonable estimators is derived. The authors of a recent paper
derive the bias function that leads to the minimum trace of
the resulting CRB, a lower bound on the total variance of
estimators [12]. There are a number of situations, however,
where biased estimators are not preferred. For example,
almost all estimation problems encountered in the design of
a communication system, including the estimation of carrier
phase and symbol timing for synchronization and the estimation
of channel response for equalization, require unbiased
estimators. The third approach is to impose or to exploit some
deterministic constraints on the parameter to be estimated.
The deterministic constraints result in a parametric estimation
problem with reduced dimension, where reasonable unbiased
estimators may exist. This paper focuses on the third approach.

Take blind channel estimation as an example [13].
The goal of blind channel estimation is to estimate the channel
response $\mathbf{h}$ from $\mathbf{y} = \mathbf{s} * \mathbf{h} + \mathbf{n}$, the convolution of the
channel response $\mathbf{h}$ and the input data sequence $\mathbf{s}$, corrupted
by an additive noise $\mathbf{n}$. The unknown parameter $\boldsymbol{\theta} = (\mathbf{s}, \mathbf{h})$ is
not identifiable since $(a\mathbf{s}, \frac{1}{a}\mathbf{h})$ and $(\mathbf{s}, \mathbf{h})$ are observationally
equivalent for any constant $a \neq 0$, so unbiased estimators do
not exist. In practice, this so-called scalar ambiguity problem is
resolved by assigning a pre-determined value to one of the
elements of $\mathbf{s}$ [14]. That is, a constraint $f(\boldsymbol{\theta}) = s_n - c = 0$
is put on the parameter $\boldsymbol{\theta}$, where $s_n$ denotes the $n$th element
of $\mathbf{s}$ and $c$ is some pre-determined constant. This is exactly
the third approach mentioned above.
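
The scalar ambiguity can be illustrated numerically. The following is a minimal sketch, assuming real-valued sequences and a noiseless observation; the sequence lengths, the random seed, and the scalar `a` are arbitrary choices made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
s = rng.standard_normal(8)   # hypothetical input data sequence
h = rng.standard_normal(3)   # hypothetical channel response
a = 2.5                      # any nonzero scalar

y1 = np.convolve(s, h)           # noiseless observation from (s, h)
y2 = np.convolve(a * s, h / a)   # noiseless observation from (a s, h / a)

# Both parameter pairs produce the same noiseless output, so they are
# observationally equivalent and the scale cannot be recovered from y.
same_output = np.allclose(y1, y2)
print(same_output)
```

Fixing one element of `s` to a known nonzero value removes this freedom, because the only scalar that leaves that element unchanged is `a = 1`.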

The CRB for constrained parameters has already been derived in [3],
[15], [16]. The value of the constrained CRB depends on the
choice of the constraint function; different constraint functions
lead to different values of the CRB. This bound is useful
when the constraint function is exogenously given, but there
are situations where we are able to modify the constraint
function. Take blind channel estimation as an example
again. Suppose an engineer chooses the constraint function
$f(\boldsymbol{\theta}) = s_1 - c = 0$, designs an unbiased estimator
corresponding to this constraint function, and finds that the
resulting MSE, although almost achieving the constrained CRB
with respect to the constraint function, is still unsatisfactory
compared with the target value. How can the engineer tell whether
the unsatisfactory result is caused by an inappropriate choice of
the constraint function, or simply because the target value is
not attainable for any choice of the constraint function?

The main contribution of this paper is the following the- 
orem. The Moore-Penrose generalized inverse of a singular 
FIM is the constrained CRB corresponding to the minimum 
variance among all choices of minimum constraint functions. 
According to the theorem, the logical validity of using the 
Moore-Penrose generalized inverse of a singular FIM as a 
covariance lower bound for unbiased estimators is justified, 
and a CRB for the joint design of the unbiased estimator 
and the constraint function is obtained. In addition to the
performance bound, we also derive a sufficient condition
for a constraint function to achieve the bound, which is
satisfied by a linear function of the parameter to be estimated.
These results facilitate future research on the optimal joint design
of constraint functions and unbiased estimators.

A mathematical definition of minimum constraint functions
will be given in Section IV-A, but the meaning is conceptually
easy to understand. In blind channel estimation problems, only
a one-dimensional constraint on $\boldsymbol{\theta}$ is needed to resolve the
scalar ambiguity, such as $f(\boldsymbol{\theta}) = s_n - 1 = 0$. Any constraint
function $\mathbf{f}$ that is essentially a one-dimensional constraint is a
minimum constraint function as long as the constrained CRB
exists.

The rest of the paper is organized as follows. The necessary
background knowledge is given in Section II. In Section III we show
that the Moore-Penrose generalized inverse [17] of the FIM can
be viewed as a CRB for constrained parameters with some
constraint function. Section IV is divided into
two subsections. In the first subsection we give the definition
of minimum constraint functions and justify its meaning. In the
second subsection we prove the main result of this paper, that
the Moore-Penrose generalized inverse of the FIM is the CRB
corresponding to the minimum variance among all choices of
minimum constraint functions. Conclusions are presented
in Section V.

Notation 

Bold-faced lower case letters represent column vectors, and
bold-faced upper case letters are matrices. Superscripts such
as $\mathbf{v}^*$, $\mathbf{v}^T$, $\mathbf{M}^{-1}$, and $\mathbf{M}^{\dagger}$ denote the conjugate, transpose,
inverse, and the Moore-Penrose generalized inverse of the
corresponding vector or matrix. The vector $E[\mathbf{v}]$ denotes
the expectation of the random vector $\mathbf{v}$, and $E[\mathbf{M}]$ the
expectation of the random matrix $\mathbf{M}$. The matrix $\operatorname{cov}(\mathbf{u}, \mathbf{v})$
is defined as $\operatorname{cov}(\mathbf{u}, \mathbf{v}) = E[(\mathbf{u} - E[\mathbf{u}])(\mathbf{v} - E[\mathbf{v}])^T]$, which
is the cross-covariance matrix of random vectors $\mathbf{u}$ and $\mathbf{v}$. We
use the notation $\mathbf{A} \geq \mathbf{B}$ to mean that $\mathbf{A} - \mathbf{B}$ is a
nonnegative-definite matrix. The notation $\operatorname{rank} \mathbf{M}$ denotes the rank of the
matrix $\mathbf{M}$.

II. Preliminaries 

In this section, the background knowledge required for
the discussions in the following sections is presented.
We restrict our attention to the case of unbiased estimators
for the unknown parameter, so the theorems presented in this
section may be simplified versions of those in the original
papers.

When we refer to the CRB for unconstrained parameters, 
we mean the following theorem. 

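Theorem II.1 (CRB for unconstrained parameters). Let $\hat{\boldsymbol{\theta}}$ be
an unbiased estimator of an unknown parameter $\boldsymbol{\theta} \in \mathbb{R}^n$ based
on observation $\mathbf{y}$, which is characterized by its probability
density function (pdf) $p(\mathbf{y}; \boldsymbol{\theta})$. Then for any such $\hat{\boldsymbol{\theta}}$, we have

$$\operatorname{cov}(\hat{\boldsymbol{\theta}}, \hat{\boldsymbol{\theta}}) \geq \mathbf{J}^{\dagger}, \tag{1}$$

where $\mathbf{J}$ is the FIM defined as

$$\mathbf{J} = E\left[ \frac{\partial \ln p}{\partial \boldsymbol{\theta}} \, \frac{\partial \ln p}{\partial \boldsymbol{\theta}^T} \right]. \tag{2}$$

Proof: See [8]. ■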

The above theorem is always correct given that unbiased
estimators exist. Stoica et al., however, proved the following
theorem in [9].

Theorem II.2. If the information matrix $\mathbf{J}$ is singular, then
there does not exist an unbiased estimator with finite variance.

Proof: See [9].³ ■

That is, there does not exist any reasonable unbiased
estimator $\hat{\boldsymbol{\theta}}$ if the FIM is singular, so the CRB fails to provide
any useful information.

When we refer to the CRB for constrained parameters, we 
mean the following theorem. 

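Theorem II.3 (CRB for constrained parameters). Let $\hat{\boldsymbol{\theta}}$ be an
unbiased estimator of an unknown parameter $\boldsymbol{\theta} \in \mathbb{R}^n$ based
on observation $\mathbf{y}$, which is characterized by its pdf $p(\mathbf{y}; \boldsymbol{\theta})$.
Furthermore, we require the parameter $\boldsymbol{\theta}$ to satisfy a possibly
nonlinear constraint function $\mathbf{f} : \mathbb{R}^n \to \mathbb{R}^m$, $m < n$,

$$\mathbf{f}(\boldsymbol{\theta}) = \mathbf{0}. \tag{3}$$

Assume that $\partial \mathbf{f} / \partial \boldsymbol{\theta}^T$ is full rank. Choose a matrix $\mathbf{U}$ with
$(n - m)$ orthonormal columns such that

$$\frac{\partial \mathbf{f}}{\partial \boldsymbol{\theta}^T} \mathbf{U} = \mathbf{0}. \tag{4}$$

If $\mathbf{U}^T \mathbf{J} \mathbf{U}$ is nonsingular, then

$$\operatorname{cov}(\hat{\boldsymbol{\theta}}, \hat{\boldsymbol{\theta}}) \geq \mathbf{U} (\mathbf{U}^T \mathbf{J} \mathbf{U})^{-1} \mathbf{U}^T, \tag{5}$$

where $\mathbf{J}$ is the FIM defined as in (2).

Proof: See [3]. ■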
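
The bound in Theorem II.3 can be evaluated numerically once the FIM and the constraint Jacobian are known. The following is a minimal sketch under stated assumptions: the function name `constrained_crb`, the tolerance `tol`, and the example matrices are hypothetical and are not taken from [3]; the null-space basis is obtained from an SVD of the Jacobian, which is one of several possible ways to choose $\mathbf{U}$.

```python
import numpy as np

def constrained_crb(J, F, tol=1e-10):
    """Return U (U^T J U)^{-1} U^T, where the columns of U form an
    orthonormal basis of the null space of the constraint Jacobian F."""
    J = np.asarray(J, dtype=float)
    F = np.atleast_2d(np.asarray(F, dtype=float))
    m = F.shape[0]
    # Full SVD of F: the last n - rank(F) right singular vectors span null(F).
    _, sing, Vt = np.linalg.svd(F)
    rank = int(np.sum(sing > tol))
    if rank != m:
        raise ValueError("constraint Jacobian must have full row rank")
    U = Vt[rank:].T                      # n x (n - m), orthonormal columns
    UJU = U.T @ J @ U
    if np.linalg.matrix_rank(UJU) < UJU.shape[0]:
        raise ValueError("U^T J U is singular; the constrained CRB is not finite")
    return U @ np.linalg.inv(UJU) @ U.T

# Hypothetical rank-one FIM for a two-dimensional parameter.
J = np.array([[1.0, 1.0], [1.0, 1.0]])
F = np.array([[0.0, 1.0]])   # Jacobian of the constraint theta_2 - c = 0
bound = constrained_crb(J, F)
print(bound)
```

For this particular choice the bound has trace 1, whereas the Moore-Penrose generalized inverse of the same `J` has trace 1/2, so a pilot-style constraint on a single element need not be the best choice.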

The following theorem gives a necessary and sufficient 
condition for the existence of a finite constrained CRB. 

Theorem II.4. The constrained CRB is finite if and only if
the matrix $\mathbf{U}^T \mathbf{J} \mathbf{U}$ is nonsingular.

Proof: See [3]. ■

Now we are able to discuss the relationship between the 
Moore-Penrose generalized inverse of an FIM and constrained 
CRB. 

³ When we restrict our attention to unbiased estimators for the unknown
parameter only, the condition for the existence of an unbiased estimator with
finite variance in [9] becomes $\mathbf{J}\mathbf{J}^{\dagger} = \mathbf{I}$, which is impossible for singular
FIMs.



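III. $\mathbf{J}^{\dagger}$ as a CRB for Constrained Parameters

The main result of this section is the following theorem.

Theorem III.1. Let the FIM $\mathbf{J}$ be singular, and let the singular
value decomposition (SVD) of $\mathbf{J}$ be

$$\mathbf{J} = \begin{bmatrix} \mathbf{U}_s & \mathbf{U}_n \end{bmatrix} \begin{bmatrix} \boldsymbol{\Sigma} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} \end{bmatrix} \begin{bmatrix} \mathbf{U}_s^T \\ \mathbf{U}_n^T \end{bmatrix}, \tag{6}$$

the diagonal elements of $\boldsymbol{\Sigma}$ being nonzero. Then $\mathbf{J}^{\dagger}$ is a CRB
for constrained parameters with constraint function

$$\mathbf{f}(\boldsymbol{\theta}) = \mathbf{U}_n^T \boldsymbol{\theta} + \mathbf{c} = \mathbf{0} \tag{7}$$

for some constant vector $\mathbf{c}$.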

To prove the theorem, we first prove the following lemma. 

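Lemma III.1. Let the SVD of a Hermitian matrix $\mathbf{J}$ be the
same as in (6). Then

$$\mathbf{J}^{\dagger} = \mathbf{U}_s (\mathbf{U}_s^T \mathbf{J} \mathbf{U}_s)^{-1} \mathbf{U}_s^T. \tag{8}$$

Proof: Substitute $\mathbf{J} = \mathbf{U}_s \boldsymbol{\Sigma} \mathbf{U}_s^T$ into the right-hand side of (8). Since
$\mathbf{U}_s^T \mathbf{U}_s = \mathbf{I}$, we have $\mathbf{U}_s^T \mathbf{J} \mathbf{U}_s = \boldsymbol{\Sigma}$, so the right-hand side
equals $\mathbf{U}_s \boldsymbol{\Sigma}^{-1} \mathbf{U}_s^T$, which is $\mathbf{J}^{\dagger}$. ■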
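
The identity in Lemma III.1 can be checked numerically. The following is a minimal sketch, assuming a real symmetric nonnegative definite matrix built from a random factor; the dimensions, the seed, and the cutoff `rcond=1e-10` passed to `numpy.linalg.pinv` are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)
n, r = 5, 3                          # hypothetical dimension and rank
A = rng.standard_normal((n, r))
J = A @ A.T                          # symmetric nonnegative definite, rank r

# Eigendecomposition of a symmetric matrix; eigh returns ascending eigenvalues.
eigvals, eigvecs = np.linalg.eigh(J)
Us = eigvecs[:, -r:]                 # eigenvectors of the r nonzero eigenvalues

lhs = np.linalg.pinv(J, rcond=1e-10)             # Moore-Penrose inverse
rhs = Us @ np.linalg.inv(Us.T @ J @ Us) @ Us.T   # right-hand side of the lemma
print(np.allclose(lhs, rhs))
```

For a symmetric nonnegative definite matrix the SVD and the eigendecomposition coincide, which is why `eigh` is used here to obtain the signal-subspace basis.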

Now we are able to prove Theorem III.l. 

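Proof for Theorem III.1: By examining the lemma and
Theorem II.3, we can think of $\mathbf{J}^{\dagger}$ as a constrained CRB with
some constraint function $\mathbf{f}(\boldsymbol{\theta})$ such that

$$\frac{\partial \mathbf{f}}{\partial \boldsymbol{\theta}^T} \mathbf{U}_s = \mathbf{0}. \tag{9}$$

Since $\mathbf{U}_n^T \mathbf{U}_s = \mathbf{0}$ by the definition of the SVD, a constraint
function $\mathbf{f}$ that satisfies (9) can be chosen such that

$$\frac{\partial \mathbf{f}}{\partial \boldsymbol{\theta}^T} = \mathbf{U}_n^T. \tag{10}$$

The above equation can be satisfied by a linear constraint
function,

$$\mathbf{f}(\boldsymbol{\theta}) = \mathbf{U}_n^T \boldsymbol{\theta} + \mathbf{c} = \mathbf{0}, \tag{11}$$

and the theorem is proved. ■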
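
As a small worked illustration (the matrix below is a hypothetical example chosen for easy arithmetic, not one taken from the paper), let $n = 2$ and

$$\mathbf{J} = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}, \qquad \mathbf{U}_s = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \qquad \boldsymbol{\Sigma} = 2, \qquad \mathbf{U}_n = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ -1 \end{bmatrix}.$$

Then $\mathbf{U}_s^T \mathbf{J} \mathbf{U}_s = 2$ and

$$\mathbf{J}^{\dagger} = \mathbf{U}_s (\mathbf{U}_s^T \mathbf{J} \mathbf{U}_s)^{-1} \mathbf{U}_s^T = \frac{1}{4} \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix},$$

which is the constrained CRB associated with the linear constraint $f(\boldsymbol{\theta}) = (\theta_1 - \theta_2)/\sqrt{2} + c = 0$ of Theorem III.1.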

IV. Interpretation of $\mathbf{J}^{\dagger}$ as a CRB for Constrained
Parameters

In this section we prove that $\mathbf{J}^{\dagger}$ is not only a CRB for
constrained parameters, but the CRB corresponding to the
minimum variance among all choices of minimum constraint
functions. We first give a definition of minimum constraint
functions, and then prove the claim.

A. Definition of Minimum Constraint Functions 

Minimum constraint functions are defined as follows. 

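Definition IV.1. A differentiable constraint function $\mathbf{f} : \mathbb{R}^n \to \mathbb{R}^m$,
$m < n$, for a non-Bayesian parametric estimation
problem with a singular FIM $\mathbf{J}$ is a minimum constraint function if

1) $\partial \mathbf{f} / \partial \boldsymbol{\theta}^T$ is full rank,

2) $\mathbf{U}^T \mathbf{J} \mathbf{U}$ is nonsingular, and

3) $\operatorname{rank} \partial \mathbf{f} / \partial \boldsymbol{\theta}^T + \operatorname{rank} \mathbf{J} = n$,

where $\mathbf{U}$ is chosen as in Theorem II.3.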

The first requirement ensures that $\mathbf{f}$ does not contain any
redundant constraints [15], [16]. The second requirement
ensures the existence of the CRB according to Theorem II.4. The
third requirement means that $\mathbf{f}$ contains the minimum number
of independent constraints. Take blind channel estimation
as an example. From the discussion in Section I we know
that once we choose one symbol as a pilot symbol with some
predefined value, we eliminate the scalar ambiguity and thus
an unbiased estimator exists. Note that the nullity of the FIM
is also one [18], [19], so the third requirement holds.
Now we give a formal proof that if the first two requirements
are satisfied, then the third requirement ensures that $\mathbf{f}$ contains
the minimum number of independent constraints.

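Theorem IV.1. For any constraint function $\mathbf{f}$ in Definition
IV.1 that satisfies the first and the second requirements,

$$\min_{\mathbf{f}} \operatorname{rank} \frac{\partial \mathbf{f}}{\partial \boldsymbol{\theta}^T} = n - \operatorname{rank} \mathbf{J}. \tag{12}$$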



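Proof: First we show that in order to satisfy the first and
the second requirements,

$$\operatorname{rank} \frac{\partial \mathbf{f}}{\partial \boldsymbol{\theta}^T} \geq n - \operatorname{rank} \mathbf{J}, \tag{13}$$

and then we show that the equality is achievable.
If

$$\operatorname{rank} \frac{\partial \mathbf{f}}{\partial \boldsymbol{\theta}^T} < n - \operatorname{rank} \mathbf{J}, \tag{14}$$

then by the definition of $\mathbf{U}$ (see Theorem II.3), $\mathbf{U}$ is an $n$-by-$(\operatorname{rank} \mathbf{U})$
matrix with

$$n \geq \operatorname{rank} \mathbf{U} > \operatorname{rank} \mathbf{J}. \tag{15}$$

By the fact that

$$\operatorname{rank} \mathbf{U}^T \mathbf{J} \mathbf{U} \leq \min\{\operatorname{rank} \mathbf{U}, \operatorname{rank} \mathbf{J}\} \leq \operatorname{rank} \mathbf{J} < \operatorname{rank} \mathbf{U}, \tag{16}$$

where the last inequality follows from (15), and noting that
$\mathbf{U}^T \mathbf{J} \mathbf{U}$ is a $(\operatorname{rank} \mathbf{U})$-by-$(\operatorname{rank} \mathbf{U})$ square matrix, $\mathbf{U}^T \mathbf{J} \mathbf{U}$
cannot be full rank, which contradicts the second requirement.
Thus (13) is proved.

The achievability of equality in (13) is easy to prove. Choose
the constraint function $\mathbf{f}$ as in (7), and we can see that such
a constraint function satisfies all of the requirements of a
minimum constraint function. ■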

By the above theorem we can see that the third requirement
in fact requires $\partial \mathbf{f} / \partial \boldsymbol{\theta}^T$ to have the minimum rank. The
reason why such a constraint function $\mathbf{f}$ can be considered as
the constraint function with minimum constraints can be seen
from the following theorem.

Theorem IV.2. Let $A \subset \mathbb{R}^n$ be open and let $\mathbf{f} : A \to \mathbb{R}^m$
be a differentiable function such that $\partial \mathbf{f} / \partial \boldsymbol{\theta}^T$ has rank $m$
whenever $\mathbf{f}(\boldsymbol{\theta}) = \mathbf{0}$. Then $\mathbf{f}^{-1}(\mathbf{0})$ defines an
$(n - m)$-dimensional manifold in $\mathbb{R}^n$.

Proof: See [20]. ■

Constraint functions $\mathbf{f}$ whose Jacobians $\partial \mathbf{f} / \partial \boldsymbol{\theta}^T$ have the minimum rank
ensure that the resulting manifolds have the maximal number of degrees
of freedom, so we call them constraint functions with the
minimum constraints.



B. $\mathbf{J}^{\dagger}$ is the CRB corresponding to the minimum variance
among all choices of minimum constraint functions

This subsection proves the claim that $\mathbf{J}^{\dagger}$ is the CRB
corresponding to the minimum variance among all choices
of minimum constraint functions. For convenience, the $i$th
largest eigenvalue of a matrix $\mathbf{M}$ is denoted by $\lambda_i(\mathbf{M})$ in
the following discussions.

The main result of this subsection is the following theorem. 

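Theorem IV.3. In Theorem II.3, if $\mathbf{f}$ is a minimum constraint
function, then

$$\operatorname{tr} \operatorname{cov}(\hat{\boldsymbol{\theta}}, \hat{\boldsymbol{\theta}}) \geq \operatorname{tr}(\mathbf{J}^{\dagger}). \tag{17}$$

Furthermore, equality can be achieved by choosing the
constraint function $\mathbf{f}$ as in Theorem III.1.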

Note that the trace of a covariance matrix is the sum of the
variances of the elements of $\hat{\boldsymbol{\theta}}$. In this way, we have proved that
the Moore-Penrose generalized inverse of the FIM is the CRB
corresponding to the minimum variance among all choices of
minimum constraint functions.
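
The trace inequality can be probed numerically by comparing the trace of the constrained CRB for randomly drawn admissible matrices $\mathbf{U}$ against the trace of the Moore-Penrose inverse. The following is a minimal sketch, assuming a real nonnegative definite stand-in for the FIM; the helper `bound_trace`, the dimensions, the seed, and the number of trials are hypothetical choices for illustration only.

```python
import numpy as np

rng = np.random.default_rng(2)
n, r = 5, 3
A = rng.standard_normal((n, r))
J = A @ A.T                                   # singular FIM stand-in, rank r
trace_pinv = np.trace(np.linalg.pinv(J, rcond=1e-10))

def bound_trace(J, U):
    """Trace of the constrained CRB U (U^T J U)^{-1} U^T."""
    return np.trace(U @ np.linalg.inv(U.T @ J @ U) @ U.T)

gaps = []
for _ in range(200):
    # Random n x r matrix with orthonormal columns (QR of a Gaussian matrix),
    # playing the role of U for some minimum constraint function.
    U, _ = np.linalg.qr(rng.standard_normal((n, r)))
    gaps.append(bound_trace(J, U) - trace_pinv)

# Equality case: U spans the range of J.
_, vecs = np.linalg.eigh(J)
Us = vecs[:, -r:]

print(min(gaps) >= -1e-9)
print(np.isclose(bound_trace(J, Us), trace_pinv))
```

A random search of this kind does not prove the theorem; it only shows that no sampled constraint beats the trace of the generalized inverse, and that the range-space choice attains it.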

Theorem IV.3 is in fact a corollary of the following theorem. 

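Theorem IV.4. Let the SVD of an $m$-by-$m$ nonnegative definite
matrix $\mathbf{J}$ with rank $n$ be

$$\mathbf{J} = \begin{bmatrix} \mathbf{U}_s & \mathbf{U}_n \end{bmatrix} \begin{bmatrix} \boldsymbol{\Sigma} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} \end{bmatrix} \begin{bmatrix} \mathbf{U}_s^T \\ \mathbf{U}_n^T \end{bmatrix}, \tag{18}$$

where $\boldsymbol{\Sigma}$ is an $n$-by-$n$ diagonal matrix. Then

$$\lambda_i\left(\mathbf{V} (\mathbf{V}^T \mathbf{J} \mathbf{V})^{-1} \mathbf{V}^T\right) \geq \lambda_i\left(\mathbf{U}_s (\mathbf{U}_s^T \mathbf{J} \mathbf{U}_s)^{-1} \mathbf{U}_s^T\right) = \lambda_i(\mathbf{J}^{\dagger}) \quad \forall i \tag{19}$$

for any matrix $\mathbf{V}$ with the same size as $\mathbf{U}_s$ and $\mathbf{V}^T \mathbf{V} = \mathbf{I}$.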

If the above theorem holds, then Theorem IV.3 can be 
proved as follows. 

Proof for Theorem IV.3: Note that $\mathbf{J}$ is a nonnegative definite
matrix, and the resulting $\mathbf{U}$ for every minimum constraint function
$\mathbf{f}$ has the same size as $\mathbf{U}_s$ in Theorem IV.4, so the
above theorem applies. Noting that $\mathbf{U}_s (\mathbf{U}_s^T \mathbf{J} \mathbf{U}_s)^{-1} \mathbf{U}_s^T = \mathbf{J}^{\dagger}$
according to Lemma III.1, the corollary follows from (5) because the
trace equals the sum of the eigenvalues. ■

See Appendix for the proof of Theorem IV.4. 

V. Conclusions 

We have proved the main theorem of this paper: The Moore- 
Penrose generalized inverse of a singular FIM is the CRB 
corresponding to the minimum variance among all choices 
of minimum constraint functions. According to the theorem, 
the logical validity of using the Moore-Penrose generalized 
inverse of a singular FIM as a CRB is justified, and a CRB 
for the joint design of the unbiased estimator and the constraint 
function is obtained. In addition to the performance bound, we
also derive a sufficient condition for a constraint function to
achieve the bound, which is satisfied by a linear function of the
parameter to be estimated. These results facilitate future research on
the optimal joint design of constraint functions and unbiased
estimators.



Appendix 

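The proof is mainly based on the Poincaré separation theorem
and a lemma. We first state the Poincaré separation theorem below.

Theorem A.1 (Poincaré separation theorem). Let $\mathbf{A} \in \mathbb{R}^{n \times n}$
be a Hermitian matrix, and let $\mathbf{U} \in \mathbb{R}^{n \times r}$ satisfy $\mathbf{U}^T \mathbf{U} = \mathbf{I}$.
Define $\mathbf{B}_r = \mathbf{U}^T \mathbf{A} \mathbf{U}$. Then

$$\lambda_k(\mathbf{B}_r) \leq \lambda_k(\mathbf{A}) \tag{20}$$

for all $k \in \{1, \ldots, r\}$.

Proof: See [17]. ■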
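
The separation inequality is easy to check on a random instance. The following is a minimal sketch, assuming a real symmetric matrix and an orthonormal-column matrix obtained from a QR factorization; the sizes, the seed, and the small slack `1e-12` for rounding are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
n, r = 6, 3
B = rng.standard_normal((n, n))
A = (B + B.T) / 2                                   # real symmetric matrix
U, _ = np.linalg.qr(rng.standard_normal((n, r)))    # n x r with U^T U = I

# eigvalsh returns ascending eigenvalues; reverse to get the k-th largest.
lam_A = np.sort(np.linalg.eigvalsh(A))[::-1]
lam_B = np.sort(np.linalg.eigvalsh(U.T @ A @ U))[::-1]

# Poincare separation: lambda_k(B_r) <= lambda_k(A) for k = 1, ..., r.
separated = bool(np.all(lam_B <= lam_A[:r] + 1e-12))
print(separated)
```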
Then we prove the following lemma. 

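Lemma A.1. For any nonnegative definite matrix $\mathbf{M} \in \mathbb{R}^{n \times n}$
and any matrix $\mathbf{V} \in \mathbb{R}^{m \times n}$, $m > n$, with $\mathbf{V}^T \mathbf{V} = \mathbf{I}$,

$$\lambda_i(\mathbf{V} \mathbf{M} \mathbf{V}^T) = \lambda_i(\mathbf{M}) \tag{21}$$

for all $i \in \{1, \ldots, n\}$, and

$$\lambda_i(\mathbf{V} \mathbf{M} \mathbf{V}^T) = 0 \tag{22}$$

for all $i \in \{n + 1, \ldots, m\}$.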

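Proof: Define $\mathbf{V}_1 = \mathbf{V}$ for notational convenience. By the
definition of $\mathbf{V}$, there exists a matrix $\mathbf{V}_2$ such that $[\mathbf{V}_1 \ \mathbf{V}_2]$
is a unitary matrix.

Let the SVD of $\mathbf{M}$ be $\bar{\mathbf{U}} \bar{\boldsymbol{\Sigma}} \bar{\mathbf{U}}^T$. We can construct an $m$-by-$m$
unitary matrix as

$$\tilde{\mathbf{U}} = \begin{bmatrix} \bar{\mathbf{U}} & \mathbf{0} \\ \mathbf{0} & \mathbf{I} \end{bmatrix}, \tag{23}$$

and we have

$$\mathbf{V} \mathbf{M} \mathbf{V}^T = \begin{bmatrix} \mathbf{V}_1 & \mathbf{V}_2 \end{bmatrix} \begin{bmatrix} \bar{\mathbf{U}} \bar{\boldsymbol{\Sigma}} \bar{\mathbf{U}}^T & \mathbf{0} \\ \mathbf{0} & \mathbf{0} \end{bmatrix} \begin{bmatrix} \mathbf{V}_1^T \\ \mathbf{V}_2^T \end{bmatrix} = \begin{bmatrix} \mathbf{V}_1 & \mathbf{V}_2 \end{bmatrix} \tilde{\mathbf{U}} \begin{bmatrix} \bar{\boldsymbol{\Sigma}} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} \end{bmatrix} \tilde{\mathbf{U}}^T \begin{bmatrix} \mathbf{V}_1^T \\ \mathbf{V}_2^T \end{bmatrix},$$

so

$$\mathbf{V} \mathbf{M} \mathbf{V}^T = \left( \begin{bmatrix} \mathbf{V}_1 & \mathbf{V}_2 \end{bmatrix} \tilde{\mathbf{U}} \right) \begin{bmatrix} \bar{\boldsymbol{\Sigma}} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} \end{bmatrix} \left( \begin{bmatrix} \mathbf{V}_1 & \mathbf{V}_2 \end{bmatrix} \tilde{\mathbf{U}} \right)^T. \tag{24}$$

Note that the product of two unitary matrices is also a
unitary matrix. Therefore (24) is the SVD of the matrix $\mathbf{V} \mathbf{M} \mathbf{V}^T$.
Since $\mathbf{V} \mathbf{M} \mathbf{V}^T$ is a nonnegative definite matrix, we can infer
from its SVD that its eigenvalues are $\lambda_1(\mathbf{M}), \ldots, \lambda_n(\mathbf{M})$ and
$(m - n)$ zeros, and the lemma follows. ■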
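
The statement of Lemma A.1 can be checked on a random instance. The following is a minimal sketch, assuming real matrices; the sizes, the seed, and the absolute tolerance used to decide that an eigenvalue is numerically zero are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)
m, n = 6, 3
C = rng.standard_normal((n, n))
M = C @ C.T                                        # n x n nonnegative definite
V, _ = np.linalg.qr(rng.standard_normal((m, n)))   # m x n with V^T V = I

# Eigenvalues sorted in descending order, so index i-1 is the i-th largest.
lam_M = np.sort(np.linalg.eigvalsh(M))[::-1]
lam_big = np.sort(np.linalg.eigvalsh(V @ M @ V.T))[::-1]

leading_match = np.allclose(lam_big[:n], lam_M)           # first n eigenvalues
trailing_zero = np.allclose(lam_big[n:], 0.0, atol=1e-10)  # remaining m - n
print(leading_match, trailing_zero)
```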

Now we are able to prove Theorem IV.4. 

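Proof for Theorem IV.4: By Lemma A.1, we know that

$$\lambda_i\left(\mathbf{V} (\mathbf{V}^T \mathbf{J} \mathbf{V})^{-1} \mathbf{V}^T\right) = \lambda_i\left(\mathbf{U}_s (\mathbf{U}_s^T \mathbf{J} \mathbf{U}_s)^{-1} \mathbf{U}_s^T\right) = 0 \tag{25}$$

for $i \in \{n + 1, n + 2, \ldots, m\}$, and

$$\lambda_i\left(\mathbf{V} (\mathbf{V}^T \mathbf{J} \mathbf{V})^{-1} \mathbf{V}^T\right) = \lambda_i\left((\mathbf{V}^T \mathbf{J} \mathbf{V})^{-1}\right), \tag{26}$$

$$\lambda_i\left(\mathbf{U}_s (\mathbf{U}_s^T \mathbf{J} \mathbf{U}_s)^{-1} \mathbf{U}_s^T\right) = \lambda_i\left((\mathbf{U}_s^T \mathbf{J} \mathbf{U}_s)^{-1}\right) \tag{27}$$

for $i \in \{1, 2, \ldots, n\}$, so it suffices to prove

$$\lambda_i\left((\mathbf{V}^T \mathbf{J} \mathbf{V})^{-1}\right) \geq \lambda_i\left((\mathbf{U}_s^T \mathbf{J} \mathbf{U}_s)^{-1}\right) \tag{28}$$

for $i \in \{1, 2, \ldots, n\}$, or equivalently, since the eigenvalues of an
inverse are the reciprocals of the original eigenvalues in reverse order,

$$\lambda_i(\mathbf{V}^T \mathbf{J} \mathbf{V}) \leq \lambda_i(\mathbf{U}_s^T \mathbf{J} \mathbf{U}_s). \tag{29}$$

Noting that $\lambda_i(\mathbf{U}_s^T \mathbf{J} \mathbf{U}_s) = \lambda_i(\mathbf{J})$ for $i \in \{1, \ldots, n\}$ because they have the
same first $n$ eigenvalues, and by the fact that a nonnegative
definite matrix is always Hermitian, we can see that (29) is just
a consequence of the Poincaré separation theorem. Therefore the theorem
follows. ■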